Somatic copy number alteration and fragmentation analysis in circulating tumor DNA for cancer screening and treatment monitoring in colorectal cancer patients

Background Analysis of circulating free DNA (cfDNA) is a promising tool for personalized management of colorectal cancer (CRC) patients. Untargeted cfDNA analysis using whole-genome sequencing (WGS) does not need a priori knowledge of the patient's mutation profile.
Methods Here we established LIquid biopsy Fragmentation, Epigenetic signature and Copy Number Alteration analysis (LIFE-CNA) using WGS with ~6× coverage for detection of circulating tumor DNA (ctDNA) in CRC patients as a marker for CRC detection and monitoring.
Results We describe the analytical validity and a clinical proof-of-concept of LIFE-CNA using a total of 259 plasma samples collected from 50 patients with stage I-IV CRC and 61 healthy controls. To reliably distinguish CRC patients from healthy controls, we determined cutoffs for the detection of ctDNA based on global and regional cfDNA fragmentation patterns, transcriptionally active chromatin sites, and somatic copy number alterations. We further combined global and regional fragmentation patterns into a machine learning (ML) classifier to accurately predict ctDNA for cancer detection. By following individual patients throughout their course of disease, we show that LIFE-CNA enables the reliable prediction of response or resistance to treatment up to 3.5 months before the commonly used tumor marker carcinoembryonic antigen (CEA).
Conclusion In summary, we developed and validated a sensitive and cost-effective method for untargeted ctDNA detection at diagnosis as well as for treatment monitoring of all CRC patients based on genetic as well as non-genetic tumor-specific cfDNA features. Thus, once sensitivity and specificity have been externally validated, LIFE-CNA has the potential to be implemented into clinical practice. To the best of our knowledge, this is the first study to consider multiple genetic and non-genetic cfDNA features in combination with ML classifiers and to evaluate their potential in both cancer detection and treatment monitoring.
Trial registration DRKS00012890.
Supplementary Information The online version contains supplementary material available at 10.1186/s13045-022-01342-z.

(RT) and was transferred to a new 15 ml reaction tube, followed by two centrifugations at 16,000 × g for 10 min at 4°C. 500 µl of plasma was transferred to a separate 1.5 ml reaction tube for CEA analysis. Plasma was stored at -80°C until further sample processing.

cfDNA extraction
Cell-free DNA from 2 to 7.5 ml plasma was isolated using the QIAamp circulating nucleic acid kit (Qiagen, Hilden, Germany, #Cat 55114). All buffer volumes were adjusted to the respective plasma volumes. All membrane washing steps were performed twice. cfDNA concentration was quantified using the High Sensitivity NGS Fragment Analysis Kit (Agilent, Santa Clara, California, USA, #DNF-474-0500) on the

Droplet Digital PCR
Droplet Digital PCR (ddPCR) was performed using the single-probe ddPCR BRAF p.V600E assay (Bio-Rad, Hercules, California, USA, #dHsaMDV2010027) and the KRAS p.G12/p.G13 screening kit (Bio-Rad, #1863506) on the QX200 system (Bio-Rad) according to the manufacturer's instructions (see also Supplementary Methods). DNA was mixed with 10 µl of ddPCR™ Supermix for Probes (no dUTPs) (Bio-Rad, #Cat 1863023) and 1 µl of the primer/probe mixture. For gDNA from whole blood, cell lines, or tumor specimens, 5 µl of extracted DNA at a concentration of 4-6 ng/µl was used in single reactions. For cfDNA from plasma samples, 1 µl, 5 µl, or 8 µl of extracted DNA was used in three replicates at concentrations of >15 ng/µl, between 2 ng/µl and 15 ng/µl, and between 0.6 ng/µl and 2 ng/µl, respectively. 70 µl of Droplet Generation
The limit of blank (LOB) and limit of quantification (LOQ) of the ddPCR assays were established in accordance with the "Protocols for determination of limits of detection and limits of quantitation" [1], as previously described [2]. To determine the LOB, at least 60 healthy controls were measured to establish the cutoff for ctDNA detection with ≥95% specificity. To determine the LOQ, at least 40 replicates carrying the targeted variant at a known variant allele frequency (VAF) were measured to assess the assay-specific dispersion. Samples with VAFs > LOB were defined as ctDNA-positive, and samples with VAFs > LOQ were considered to harbor quantifiable ctDNA VAFs.
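The input-volume rule and the LOB/LOQ decision described above can be summarized in a minimal Python sketch. The function names, the handling of the boundary at exactly 2 ng/µl, and the example LOB/LOQ values are illustrative assumptions and are not taken from the study; the sketch also assumes LOQ ≥ LOB.

```python
from typing import Optional


def cfdna_input_volume_ul(conc_ng_per_ul: float) -> Optional[int]:
    """Return the cfDNA input volume (µl) for a given concentration.

    Ranges follow the text; assigning exactly 2 ng/µl to the 5 µl group
    is an assumption. Returns None outside the stated ranges.
    """
    if conc_ng_per_ul > 15:
        return 1
    if 2 <= conc_ng_per_ul <= 15:
        return 5
    if 0.6 <= conc_ng_per_ul < 2:
        return 8
    return None


def ctdna_status(vaf: float, lob: float, loq: float) -> str:
    """Classify a sample by its variant allele frequency (assumes loq >= lob)."""
    if vaf > loq:
        return "positive, quantifiable"
    if vaf > lob:
        return "positive, not quantifiable"
    return "negative"


# Hypothetical cutoffs (in % VAF), for illustration only.
print(cfdna_input_volume_ul(3.5))
print(ctdna_status(0.2, lob=0.1, loq=0.3))
```

Keeping the two thresholds as explicit arguments makes it straightforward to apply assay-specific cutoffs per variant.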

CEA analysis
CEA levels were determined in plasma samples collected for cfDNA isolation using the Human CEA ELISA Kit (Biorbyt, Cambridge, UK, Cat# orb438561) according to the manufacturer's instructions.

Library preparation and sequencing
Whole-genome sequencing of cfDNA and gDNA samples was performed using the NEBNext® Ultra™ DNA

Somatic copy number alterations
Sex- and chromosome-specific cutoffs for the identification of somatic copy number gains (positive LOB) and losses (negative LOB) were established based on the distribution of log2 ratios in each bin of 55 healthy controls for 80% confidence intervals (CIs) (Supplementary Equation 3). Log2 ratios deviating from the mean by more than four standard deviations were removed from this analysis.

Supplementary Equation 3 Determination of the LOB with 80% CI using a parametric approach
$$\mathrm{LOB}_{\pm} = \mu \pm 1.282\,\sigma$$
µ: mean of healthy control log2 ratios; σ: standard deviation of healthy control log2 ratios.
Segments with log2 ratios above the positive LOB or below the negative LOB were identified as gain or loss, respectively.
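As an illustration of the parametric cutoff, the following Python sketch derives the positive and negative LOB from a list of healthy-control log2 ratios and calls a segment against them. It assumes a single pass of outlier removal and the sample standard deviation (n - 1 denominator); neither detail is specified in the text, and the function names are hypothetical.

```python
from statistics import mean, stdev

Z_80 = 1.282  # z-value bounding the central 80% of a normal distribution


def lob_cutoffs(log2_ratios, outlier_sd=4.0):
    """Return (negative LOB, positive LOB) from healthy-control log2 ratios."""
    mu, sd = mean(log2_ratios), stdev(log2_ratios)
    # Drop values deviating more than `outlier_sd` standard deviations from the mean.
    kept = [x for x in log2_ratios if abs(x - mu) <= outlier_sd * sd]
    mu_kept, sd_kept = mean(kept), stdev(kept)
    return mu_kept - Z_80 * sd_kept, mu_kept + Z_80 * sd_kept


def call_segment(log2_ratio, neg_lob, pos_lob):
    """Label a segment as gain, loss, or neutral relative to the LOB cutoffs."""
    if log2_ratio > pos_lob:
        return "gain"
    if log2_ratio < neg_lob:
        return "loss"
    return "neutral"


# Toy control values, for illustration only.
controls = [-0.02, -0.01, 0.0, 0.01, 0.02]
neg_lob, pos_lob = lob_cutoffs(controls)
print(call_segment(0.1, neg_lob, pos_lob))
```

In the study the cutoffs are sex- and chromosome-specific, so a function like this would be applied once per stratum rather than genome-wide.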

Details to machine learning model for tumor detection
Machine learning classifiers were built based on 134 samples from 50 CRC patients with clinically evident tumor burden and 63 negative control samples. In detail, negative controls consisted of 55 healthy controls and 8 samples collected from patients more than 6 weeks in remission.
For performance evaluation, the test-set predictions of the best model of each iteration were stored for receiver operating characteristic (ROC) analysis. The ROC curve and its area under the curve (AUC) were calculated by averaging over the 100 individual curves. To combine the models obtained for the different feature sets, the meta-learner published by Peneder et al. in 2021 [5] was used.
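One common way to average ROC curves across iterations is vertical averaging: each curve is interpolated onto a shared false-positive-rate grid, the true-positive rates are averaged, and the AUC is taken from the mean curve. The text does not state which averaging scheme was used, so the pure-Python sketch below is an assumption-laden illustration with hypothetical function names. It expects binary labels (1 = CRC, 0 = control) with both classes present in every iteration.

```python
def roc_points(labels, scores):
    """ROC points (FPR, TPR) obtained by sweeping the score threshold from high to low."""
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    fpr, tpr = [0.0], [0.0]
    tp = fp = i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def tpr_at(fpr, tpr, x):
    """Interpolate TPR at false-positive rate x; at vertical steps take the upper value."""
    best = 0.0
    for j in range(1, len(fpr)):
        x0, x1 = fpr[j - 1], fpr[j]
        if x0 <= x <= x1:
            if x1 == x0:
                y = max(tpr[j - 1], tpr[j])
            else:
                y = tpr[j - 1] + (tpr[j] - tpr[j - 1]) * (x - x0) / (x1 - x0)
            best = max(best, y)
    return best


def mean_roc(runs, n_grid=101):
    """Average the ROC curves of several (labels, scores) runs on a common FPR grid."""
    grid = [k / (n_grid - 1) for k in range(n_grid)]
    curves = [roc_points(labels, scores) for labels, scores in runs]
    mean_tpr = [sum(tpr_at(f, t, x) for f, t in curves) / len(curves) for x in grid]
    # Trapezoidal AUC of the averaged curve.
    auc = sum((grid[k + 1] - grid[k]) * (mean_tpr[k] + mean_tpr[k + 1]) / 2
              for k in range(n_grid - 1))
    return grid, mean_tpr, auc
```

A call such as `mean_roc(runs)` with `runs` holding the 100 stored test-set predictions would return the grid, the averaged curve, and its AUC.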
We evaluated the performance of ML classifiers on the following feature sets:
(i) Global fragmentation, described by the proportion of fragments with lengths of 100 to 150 bp, 160 to 180 bp, 180 to 220 bp, and 250 to 320 bp, the ratio of 100-150 bp to 163-169 bp fragments, and the ratio of 160-180 bp to 180-220 bp fragments. The amplitude at 10 bp was also included [5,6].
(ii) Regional fragmentation, in terms of read depth of short (90 to 150 bp) and long (151 to 220 bp) fragments and the log2-transformed S/L ratio in 639 5 Mb bins, as well as the z-scored and log2-transformed S/L ratio over complete chromosome arms (Supplementary Table 4) [5,7].
Table 5). Following downsampling, the BAM files of CRC patients and healthy controls were merged using samtools merge (v.1.10).
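The global fragment-size proportions and the regional short/long (S/L) ratio from feature sets (i) and (ii) can be sketched in Python as follows. The sketch assumes fragment lengths have already been extracted from the alignments, treats all length ranges as inclusive (so 180 bp fragments count in two ranges), and returns NaN where a ratio is undefined; these choices and all names are illustrative assumptions rather than the study's implementation.

```python
import math

# Length ranges (bp) for the global proportions, as listed in feature set (i).
GLOBAL_RANGES = [(100, 150), (160, 180), (180, 220), (250, 320)]


def proportion(lengths, lo, hi):
    """Fraction of fragments with lo <= length <= hi (inclusive bounds assumed)."""
    return sum(lo <= n <= hi for n in lengths) / len(lengths)


def safe_ratio(a, b):
    """a / b, or NaN when the denominator is zero."""
    return a / b if b else math.nan


def global_fragmentation_features(lengths):
    """Proportions and ratios describing the genome-wide fragment size profile."""
    feats = {f"p_{lo}_{hi}": proportion(lengths, lo, hi) for lo, hi in GLOBAL_RANGES}
    feats["ratio_100_150_to_163_169"] = safe_ratio(
        feats["p_100_150"], proportion(lengths, 163, 169))
    feats["ratio_160_180_to_180_220"] = safe_ratio(
        feats["p_160_180"], feats["p_180_220"])
    return feats


def log2_short_long_ratio(lengths_in_bin):
    """log2 of short (90-150 bp) over long (151-220 bp) fragment counts in one bin."""
    short = sum(90 <= n <= 150 for n in lengths_in_bin)
    long_ = sum(151 <= n <= 220 for n in lengths_in_bin)
    if short == 0 or long_ == 0:
        return math.nan
    return math.log2(short / long_)


# Toy fragment lengths, for illustration only.
toy = [120] * 2 + [166] * 4 + [170] * 2 + [200] * 2
print(global_fragmentation_features(toy))
print(log2_short_long_ratio(toy))
```

For the regional features, `log2_short_long_ratio` would be evaluated once per 5 Mb bin; the z-scoring over chromosome arms is not shown here.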

In silico dilutions

Study cohort
Supplementary

Coverage in CRC-specific active chromatin
Using the LIQUORICE tool [5], we analyzed the coverage in six CRC-specific active chromatin region sets.
We compared samples collected from CRC patients at various time points of disease to healthy controls (n=55). Significantly stronger coverage drops were observed for all region sets in samples collected from CRC patients at diagnosis, during therapy, with stable disease, and with progressive disease.

Comparison of ML classifiers to hotspot assays
We further observed that the integrated classifiers detected ctDNA with high sensitivity in all samples collected from patients with clinically evident tumor burden. This finding was supported by more sensitive predictions of ctDNA throughout treatment compared to targeted hotspot assays. The classifier based on global fragmentation detected ctDNA in 12/30 samples, and the classifiers based on regional fragmentation and the meta-learner detected ctDNA in 30/30 (100%) samples, whereas targeted hotspot assays detected ctDNA in only 13/30 samples from CRC patients with clinically evident tumor burden. However, in contrast to hotspot assays, these classifiers are not informative for the detection of response or resistance to treatment, as they only identify the presence of ctDNA and cannot support ctDNA quantification (Figure 5, Supplementary Figures 6, 9, 10, 13, and 14).

SCNA-based tumor detection using ichorCNA is not suitable for ctDNA detection in CRC patients
To identify and quantify tumor content, we initially applied the ichorCNA tool [3]. We tested whether ichorCNA can reliably discriminate CRC patients across all stages from healthy controls based on tumor content estimation. To test the specificity of this approach, we first analyzed 55 healthy controls. We

Disease monitoring

LB-CRC-07
Patient LB-CRC-07 was diagnosed with UICC stage II MSI CRC. A BRAF p.V600E somatic variant was identified in tumor tissue. About eight months after R0 resection, a single liver metastasis was identified and surgically removed. Around seven months after surgical removal of the liver metastasis, systemic nodal progression was diagnosed, and chemotherapy was initiated two weeks later. Partial remission was clinically confirmed 1.5 months after initiation of chemotherapy, and stable disease was diagnosed another 4 months later. Throughout all subsequent stagings, disease remained stable and no progression was detected. Around 11 months after initiation of chemotherapy, treatment was changed to checkpoint inhibitor therapy. Disease remained stable over the following 10 months, until the patient dropped out of the study three years after initial diagnosis.

LB-CRC-08
Patient LB-CRC-08 was diagnosed with UICC stage IVC MSS CRC with metastasis in peritoneum and lung.

LB-CRC-29
Patient LB-CRC-29 was diagnosed with UICC stage IIIB CRC. The patient was treated with primary surgery followed by adjuvant chemotherapy. Supplementary

LB-CRC-32
Patient LB-CRC-32 was diagnosed with UICC stage IV MSS CRC with liver metastasis. A BRAF p.V600E somatic variant was identified in tumor tissue. Six months after initiation of palliative chemotherapy, partial remission was identified. Around 4.5 months later, the patient presented with stable disease. Another six months later, progressive disease was detected.

LB-CRC-34
Patient LB-CRC-34 was diagnosed with UICC stage IV MSS CRC with metastasis in liver and lung. Two months after initiation of palliative chemotherapy, partial remission was detected. Another three months later, stable disease was identified and treatment was paused. After 1.5 months of paused treatment, maintenance therapy was initiated. Two months later, progressive disease was diagnosed and treatment was changed. This change of treatment did not improve the patient's condition, and the patient died around one year after initial diagnosis.

LB-CRC-35
Patient LB-CRC-35 was diagnosed with UICC stage III MSS CRC. The patient was treated for two months with neoadjuvant radiochemotherapy. Two months after neoadjuvant treatment, the patient received surgery. One month after surgery, the patient was treated for two months with adjuvant radiochemotherapy. Following adjuvant chemotherapy, no clinically evident tumor was observed over four months of follow-up.

LB-CRC-38
Patient LB-CRC-38 was diagnosed with UICC stage IV CRC with peritoneal carcinosis. One month after primary surgery, palliative chemotherapy was initiated. Three months after initiation of palliative chemotherapy, stable disease was observed.

LB-CRC-43
Patient LB-CRC-43 was diagnosed with UICC stage IVA MSS CRC with metastasis in liver and lung. The patient was treated with initial palliative chemotherapy for two months. Following chemotherapy, partial remission was identified. One month after completion of initial chemotherapy, liver metastases were resected, followed by radiation of the colon carcinoma over one month. Following radiation, the colon carcinoma was surgically removed. Around 4 months after rectum resection, progressive disease was detected. Therefore, the patient was treated with a second chemotherapy over four months. Two months after initiation of the second chemotherapy, stable disease was detected.

LB-CRC-47
Patient LB-CRC-47 was diagnosed with UICC stage III CRC. The patient was treated with primary surgery followed, around one month later, by adjuvant chemotherapy for five months. Following adjuvant chemotherapy, the patient was in remission.

LB-CRC-48
Patient LB-CRC-48 was diagnosed with UICC stage IV MSI CRC with metastasis in liver and lung. The patient was treated with palliative chemotherapy. As early as one week after initiation of chemotherapy, progressive disease was identified and the patient received radiation for two weeks. Chemotherapy was discontinued two weeks after initiation. After ongoing progressive disease was identified one week after completion of radiation, immunotherapy was initiated. The patient died one month later.

LB-CRC-52
Patient LB-CRC-52 was diagnosed with UICC stage IIIC MSS CRC. The patient was treated with primary surgery followed, around one month later, by adjuvant chemotherapy for five months.
Supplementary Figure 17 Monitoring of cfDNA features analyzed with LIFE-CNA, CEA levels, and cfDNA concentration in patient LB-CRC-52 (stage IIIC).